Summary

A high-speed, low-cost digital signal processor, COPAS-2, has been designed 
and built, suitable for real-time operations on high quality audio signals. In two 
companion Reports, the theoretical background to the digital techniques used in 
realising equalisers and dynamic range control devices is described in detail. From 
this work it was possible to assess and optimise the architecture of a processor to 
carry out these processes. 

It was found useful to add special purpose circuitry for conversion of fixed- 
point to floating-point number representations and to calculate logarithms and 
antilogarithms. These are necessary to increase the dynamic range of signals within 
digital equalising filters and to provide the static characteristics of dynamic range 
control devices. Real-time control of the processor is achieved by switching program 
segments in or out, and all commands and control data are transferred using a 
standard interface bus, IEEE Std.488. 

A new microinstruction set is defined for this processor and comprehensive 
software and hardware debugging aids are incorporated to improve reliability. 
The highly modular nature of the processor makes it simple to build into a large 
system and methods are described for flexible routing of the digital audio signals 
and the connection of COPAS units to each other and other processing units. 

1. Introduction 



This Report is one of a group of three 
describing the design and application of a digital 
signal processor for high quality audio signals. 
Here the hardware and system aspects of such 
processing are considered in detail, with special 
attention to a second generation design known as 
COPAS-2.* This processor is a development of a 
previous design (COPAS-1), and has been 
optimised for applications in recursive digital 
filtering and dynamic range control. The two 
companion Reports (Refs. 2 and 3) describe in detail the 
processing requirements for these applications and 
show the necessity for the following capabilities in 
COPAS-2. 

1. 24-bit internal wordlength. 

2. Floating-point multiplication but fixed-point 
addition. 

3. Fast context switching. 

4. Simple control via a standard bus. 

5. Fast calculation of logarithms and 
antilogarithms. 

6. Comprehensive diagnostics and self-testing. 



In addition, a modular approach was required 
to simplify the task of inter-connecting many 
identical units. This is particularly relevant to the 
design of mixing equipment where there is a need 
for a large number of similar, parallel processing 
paths. In this Report, the techniques for 
incorporating these facilities are described. 

The objectives of the design were to compute 
at least four high-quality second-order filters and 
provide full facilities for dynamic range control, 
gain control, etc. at any practical audio sampling 
rate up to 50.4 kHz. The available processing 
power would be allocated efficiently by the use of 
context switching, and the use of a standard bus 
for control would permit 'high level' messages to 
be transmitted between processors and a central 
system controller. This would also simplify the 
self-test and diagnostic software since error 
messages could be interpreted by this controller. 

*COPAS is an acronym for Computer for Processing Audio Signals. 



2. Architectural features of COPAS-2 

2.1. Choice of Processor 

The basic building block of the processor is 
the Am2901B bit-slice. Six of these circuits are 
connected together, providing a 24-bit wordlength 
arithmetic logic unit (ALU) with 16 registers and 
shift functions. The device has a convenient 
architecture for signal processing work particularly 
for applications with relatively simple algorithms 
which are required to operate at very high speed. 
The 16 on-chip registers can be used as the delay 
elements in digital filters, memory address registers 
or general purpose accumulators, etc. By installing 
these devices in a microprogrammed system, all the 
arithmetic and logic functions for a signal 
processor can be obtained, though, as has been 
shown in earlier work, it is beneficial to install 
additional hardware to increase processing speed. 

In the previous COPAS design, a different 
bit-slice was used because it was more highly 
integrated and provided several extra facilities, 
including two separately controlled address counters. 
However, the device is not well supported by 
software development aids and has not been 
developed to higher speed versions. It is also more 
difficult to program. Simpler ALUs were considered 
but in each case, some extra logic would have had 
to be added to provide the basic signal processor 
functions. The Am2901 was therefore chosen and 
the highest speed version available at the time, the 
Am2901B, was used. The design was based on a 
cycle time of 140ns, though with future 
developments in mind (such as the Fairchild F2901 
which uses ECL technology internally and FAST 
logic to interface), cycle times approaching 100ns 
may be possible without altering the design. 

There are several other useful facilities pro- 
vided by the Am2901B processor. Add-and-shift 
logic is provided so that microprogrammed multiply 
can be executed with a full precision product; the 
ALU can be bypassed for fast access to working 
registers reducing the access times required for 
memory addressed by these registers; and two 
independent sources can be accessed simultaneously 
for the ALU, thus saving machine cycles. This latter 
feature is of great use in implementing the z^-1 
function for digital filters type 2AB and the method 
is illustrated in Fig. 1. The filter delay is provided 
by one of the working registers of the Am2901B and 
is selected with the input as sources to the ALU. The 
arithmetic addition is provided in the ALU and the 
result written back into the same working register 
while simultaneously being available at the device 
outputs. All these operations occur during a single 
140ns machine cycle. 

2.2. Internal Bus Structure 

For efficient implementation of audio signal 
processing algorithms, particularly digital filtering, 
a high speed hardware multiplier has been found to 
be necessary. In COPAS-2 this is incorporated 
with the bit-slice elements in a three bus architecture 
shown in Fig. 2. The three main data highways are: 

1. from the bit-slice elements output to one 
multiplier input - Y bus. 

2. from a coefficient memory to the other 
multiplier input - C bus. 

3. from the multiplier output to the bit-slice 
elements input - M bus. 

This arrangement ensures that for many operations, 
both the bit-slice elements and the hardware 
multiplier are kept busy. Typically, the multiplier 
and multiplicand can be loaded simultaneously into 
the multiplier, or a previously calculated product 
can be accessed while loading a new multiplier. 
However, all three operations cannot be carried out 



simultaneously because the lowest order bits of the 
product are obtained from a bi-directional port used 
for the multiplicand (see Section 3.1). This 
difficulty can be minimised by arranging that the 
multiplicand does not change frequently, which is 
easily done for some recursive filter structures. The 
products from the multiplier are truncated to 24 bits, 
suitable for input to the six bit-slice elements. 
This wordlength was selected as a result of 
simulations of the effect of roundoff noise in 
the digital filters to be implemented. However, 
the inputs to the multiplier have only 16 bits 
and for this reason the data busses into the 
multiplier are restricted to this wordlength. 

A single register bypasses the multiplier so 
that further parallel processing can take place whilst 
the multiplier is busy. This is typically used for 
transferring data directly to the bit-slice elements 
or for routing data from the bit-slice elements to 
the output ports. 

For many applications, additional storage is 
required for audio data, such as a preliminary delay 
for limiters, or as a delay to compensate for 
non-optimum microphone placement. A second memory 
is provided which is addressed by the bit-slice elements 
and can, in principle, be extended to 2^24 words. 
In practice, between 256 and 4096 words is 
sufficient on-board storage and this is provided 
in the delay memory shown in Fig. 2. Writing 
operations require two machine cycles since the 
data must first be set up in the memory data 
register (MDR). This area of the processor is 
probably the least exploited. For example, one 
design modification currently being considered is 
to arrange this memory as memory-mapped 
output. With this configuration, large blocks of 
data could be transferred into the memory from 
external devices under external control, while 
COPAS processes the data uninterrupted by input/ 
output operations. With the simpler arrangement 
shown in Fig. 2, short audio delays can easily be 
programmed and this was seen to be the first 
application. 

Fig. 2 - Programmer's block diagram of COPAS-2. 

Up to 16 input and output ports can be con- 
nected as shown in Fig. 2. The input and output 
circuits automatically transfer data to or from 
multiplexed data busses A and B and permit up 
to eight identical COPAS-2 units to be connected 
together. Routing audio data from one unit to 
another is then easily carried out by reordering 
the multiplexed data sequence, a process described 
further in Section 4.1. Input and output 
wordlengths are 16 bits, but as is the case with all 
the internal data busses, the wordlength can be 
easily increased without changing the architecture 
or programming. 

2.3. Control Bus 

The control bus has been designed in 
accordance with an IEEE Standard which 



describes a parallel data bus by which up to 15 
devices may be interconnected. Communication 
is between 'talkers' and 'listeners' where in the 
bus context a talker is a device enabled to send 
data over the bus and a listener is one enabled 
to receive data. A controller is required to 
designate which device is to talk and which devices 
are to listen. A controller is also capable of 
sending special types of data (e.g. universal 
commands) to all devices connected to the bus 
and receiving status data from other devices. 

In COPAS-2 an allowable subset of interface 
functions is used and listed in Table 1. 
This level of compatibility is adequate for all the 
communications discussed in this Report and 
forms the base level of the operating software of 
the MPU which is machine independent. The 
second level of software is machine dependent and 
involves the decoding of messages into internally 
recognisable routines. A list of messages for 
COPAS-2 and the corresponding action which is 
taken is given in the two companion reports (Refs. 2 and 3). 

In general, communication between 
controller and COPAS-2 (and other devices in this 
modular system) is by means of records of up to 
64 characters. These records are transferred to or 
from data buffers directly via the GPIB, with the 
result that the bus is not held up while the MPU 
interprets the messages. In a system with many 
devices this ensures that the controller can service 
each device in the shortest possible time. If the 
MPU fails to interpret a message then an error 
message is generated. When the error message 
has been loaded in the output buffer, a 'service 
request' is initiated and the controller carries out a 
serial poll of all devices in the system to discover 
which device originated the request. By these 
means, the possibility of one device 'hanging 
up' is reduced and the controller can take 
appropriate action to either correct the difficulty or alert 
the user to a fault condition. 

Ident.   Interface Function Capability 
SH1      Source handshake 
AH1      Acceptor handshake 
T2       Basic talker, serial poll 
L2       Basic listener 
SR1      Service request 
RL0      No capability, remote/local 
PP0      No capability, parallel poll 
DC0      No capability, device clear 
DT0      No capability, device trigger 

Table 1 - COPAS-2 GPIB Interface Capabilities 

(PH-236) 

2.4. Multiply-Shift-Accumulate Hardware 

In processing digital audio signals, care must 
be taken to avoid cumulative roundoff errors 
impairing the noise performance of the output. A 
particular problem occurs when a fixed-point 
hardware multiplier is used because the wordlength 
at the multiplier inputs is necessarily restricted. 
With the 16-bit inputs of the multiplier used, 
the TRW MPY-16HJ, and 16-bit quantisation of 
the audio inputs, it has been shown that it is 
necessary to use a floating-point technique. 
However, for addition, it is convenient to supply 
additional bits in an extended fixed-point 
accumulator. 

It is, therefore, necessary to provide 
fixed-point to floating-point conversion between 
addition and multiply operations, and 
floating-point to fixed-point conversion between 
multiply and addition operations. The former is 
more complex than the latter, for which additional 
hardware can provide a relatively simple solution. 
However, it was decided to do fixed-point to 
floating-point conversion in software in the light of 
work done on digital filter structures. This 
showed that apart from a conversion at the 
processor input, fixed-point to floating-point 
conversion is necessary only once at each filter 
output. With a five instruction microsubroutine 
a 16-bit mantissa and 2-bit exponent can be 
generated, though a 3-bit exponent can also 
be managed. For the single filter examples investi- 
gated, a 2-bit exponent was sufficient. 

The hardware for floating-point to 
fixed-point conversion is illustrated in Fig. 2. 
The exponent of the product is calculated and 
stored in a register in the control logic and is 
used to arithmetically right shift* the product between 
zero and seven places before the result is transferred to 
a 24-bit fixed-point ALU. This operation is carried 
out five times in each biquadratic section filter and 
thus makes the additional hardware worthwhile. 
We shall see in Section 2.5. that this hardware is 
also required for other processes. 
* a binary shifting operation which preserves sign. 
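
The conversion between the two number formats can be illustrated with a short sketch. It is not the COPAS-2 microcode. It assumes that the exponent counts the redundant sign bits removed from a 24-bit two's complement accumulator value (up to three, for a 2-bit exponent) and that the mantissa is the top 16 bits of the normalised word. The function names are hypothetical.

```python
ACC_BITS = 24       # fixed-point accumulator wordlength
MANT_BITS = 16      # multiplier input wordlength
MAX_EXP = 3         # largest value of a 2-bit exponent (assumed meaning)


def fixed_to_float(value):
    """Normalise a signed 24-bit integer into (mantissa, exponent).

    Assumed scheme: shift left while the doubled value still fits in
    24 bits, up to MAX_EXP places, then keep the 16 most significant bits.
    """
    half_range = 1 << (ACC_BITS - 2)
    exponent = 0
    while exponent < MAX_EXP and -half_range <= value < half_range:
        value <<= 1
        exponent += 1
    mantissa = value >> (ACC_BITS - MANT_BITS)   # arithmetic shift, sign kept
    return mantissa, exponent


def float_to_fixed(mantissa, exponent):
    """Realign a mantissa to 24 bits with an arithmetic right shift."""
    return (mantissa << (ACC_BITS - MANT_BITS)) >> exponent


def truncate_only(value):
    """Reference case: drop the 8 LSBs with no exponent at all."""
    step = ACC_BITS - MANT_BITS
    return (value >> step) << step
```

For the input 291 this gives a mantissa of 9 with exponent 3, which realigns to 288, whereas plain truncation to 16 bits returns 256. The gain in resolution at low levels is the reason for using an exponent ahead of a 16-bit multiplier input.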



2.5. Log-Antilog Hardware 

In the companion Report on dynamic range 
control (Ref.3) it is shown that logarithms and 
antilogarithms must be calculated to implement the 
static characteristics of variable-gain devices. In 
the same Report a method is discussed for 
expressing a number in floating-point format to 
simplify the task, since with 16-bit quantised 
inputs, a look-up technique is impractical. The 
realisation of this technique is shown in block 
diagram form in Figs. 3 and 4 in which linear 
numbers are expressed as positive fractional binary 
numbers and, with appropriate scaling, the 
corresponding logarithms (base 2) are therefore, 
negative, fractional binary numbers. 

In generating the logarithm of a number, the 
sign bit of the linear input, the MSB, is discarded 
and the next eight bits are examined in the priority 
encoder to find the position of the first logic '1'. 
This position provides the exponent in a 
floating-point representation of the linear input 
and is used to control a shifter (0 to 7 places) 
which then outputs the mantissa. A 12-bit 
logarithm is then easily generated by selecting a 
'1' for the MSB since the output will be negative; 
the logarithm of an exponent provides 3 bits 
directly and a look-up table (a) provides the last 
8 bits. Bit 11 of the output is fixed at logic '1' 
unless the priority encoder fails to find a logic 
'1' in the eight bits of the input. In this case, 
bit 11 is set to '0' and look-up table (b) is used to 
generate all the remaining bits of the output. Thus 
for small inputs less than 2^-8, the number of 
valid bits in the output decreases to a minimum of 
ten. As explained in Ref.3 this was found to be 
the best compromise between accuracy and 
hardware/software costs. 
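
The priority-encoder and look-up method can be sketched in a few lines. This is a simplified model under stated assumptions, not the COPAS-2 circuit: the input is taken as a positive 16-bit fraction (sign bit plus 15 fractional bits), the table contents are generated here rather than taken from the PROMs, the result is returned as an ordinary number instead of the 12-bit output word, and the second table for inputs below 2^-8 is not modelled. All names are hypothetical.

```python
import math

FRAC_BITS = 15    # assumed input format: sign bit plus 15 fractional bits
TABLE_BITS = 8    # mantissa bits used to address the look-up table

# Hypothetical table contents: log2(1 + i/256), rounded to 8 fractional bits.
LOG_TABLE = [round(math.log2(1 + i / 2 ** TABLE_BITS) * 2 ** TABLE_BITS)
             for i in range(2 ** TABLE_BITS)]


def priority_encode(x):
    """Return the number of zeros before the first '1' in the eight bits
    below the sign bit, or None if all eight bits are zero."""
    top8 = (x >> (FRAC_BITS - 8)) & 0xFF
    for position in range(8):
        if top8 & (0x80 >> position):
            return position
    return None


def log2_fraction(x):
    """Approximate log2(x / 2**15) for a positive 16-bit fraction x."""
    exponent = priority_encode(x)
    if exponent is None:
        raise ValueError("input below 2**-8: second table not modelled here")
    mantissa = x << exponent                      # leading '1' now at bit 14
    index = (mantissa >> (FRAC_BITS - 1 - TABLE_BITS)) & (2 ** TABLE_BITS - 1)
    return -(exponent + 1) + LOG_TABLE[index] / 2 ** TABLE_BITS
```

With an 8-bit table address and 8-bit table entries, the result in this model stays within 0.01 of the true base 2 logarithm for every input from 2^-8 upwards.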

This processing is readily incorporated 
in COPAS-2 as shown in Fig. 2. Some reasons for 
incorporating a shifter have already been 
explained, but the device can now be additionally 
used for log/antilog calculations. For logarithms, 
the control logic must act on the outputs of a 
priority encoder and the resulting mantissa is 
passed through the ALU and addresses the PROM 
look-up tables. In practice, a maximum of 
7 microinstructions is required to calculate a 
logarithm. If increased accuracy were required at 
low signal levels, a four bit exponent could be 
derived and additional shifting operations 
performed in the ALU. However, to maintain 
13-bit accuracy for low signal levels then requires 
16 microinstructions. 

The calculation of antilogarithms is simpler 
(Fig. 4), with bits 8 to 11 of the input generating 
0 to 15 shifts in two passes of the hardware shifter. 
The look-up table is also addressed by the ALU 
and the conversion takes 6 microinstructions. 

2.6. Coefficient and Delay Memories 

These two memories are broadly for the 
storage of coefficients and audio data respectively 
(Fig. 2). The coefficient memory provides the data 
link between the signal processing circuits and the 
system controller and is updated or reviewed 
transparently by the technique of dynamic data 
exchange. The memory is addressed directly 
from the microinstruction and has a capacity of 
256 words of 16 bits. This has proved adequate 
in all applications so far considered although it 
could be expanded to 512 words using a 9-bit 
operand field. Expansion beyond this would 
require modification to the microinstruction 
format. 

The delay memory is addressed by the 
bit-slice ALU and can, therefore, be expanded, 
if loading considerations are adhered to, to 
notionally 2^24 words. However, the main purpose 
of this addressing method is to simplify the 
generation of address sequences for delays and FIR 
filters. In fact, only 256 words were incorporated 
in the prototype processor because the main 
application of the memory was to provide a 'look 
ahead' for a dynamic range limiter. The use of 
COPAS-2 for large FIR filters (> 32 taps) is not 
recommended because the program length 
increases sharply in order to provide the direct 
addressing of the coefficient RAM. However, 
short FIR filters can be accommodated quite 
efficiently. 

A useful application for an expanded delay 
memory lies in the field of array processing. Data 
from a host computer may be transferred directly 
to the memory rather than using the I/O ports. 
Using a split memory, COPAS-2 can be processing 
one half while data transfer occurs in the other 
half so providing very high speed data input and 
output. This technique is currently being 
investigated for a non-real-time video processing 
application. 

2.7. Context Switching 

The coefficient memory is one route by 
which the system controller can modify the 
execution of a COPAS-2 program. The 
other way is the use of context switching. The 
processor may be programmed to branch 
according to the status of one of eight lines 
controlled directly from the MPU and hence 
the system controller. When one of these lines 
is set and the COPAS-2 program reaches the 
instruction in which that line is tested, the 
result is to switch the program into a different 
address space. For example, before the status 
line is set, COPAS-2 may be computing a 
digital filter - but after the line is set, the 
digital filter may be bypassed and a 
completely different algorithm computed in its 
place. A major advantage of this technique is 
that program flow can be redirected without 
processing overhead. With a larger micro- 
instruction, the number of context switches 
can easily be expanded, experience revealing 
that more than 8 could be useful. 
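
The branch decision behind a context switch can be modelled in a few lines. The sketch below is an illustration only. Its argument names follow the TEST SELect, POLarity and NEXT ADDRESS fields of Section 3.1, but the exact behaviour of the COPAS-2 branch logic is an assumption, as is the function name.

```python
def next_address(pc, branch_target, test_select, polarity, context_lines):
    """Return the address of the next microinstruction.

    pc            -- address of the current microinstruction
    branch_target -- address taken when the test succeeds
    test_select   -- which of the eight context lines to test (0 to 7)
    polarity      -- line state that causes the branch (True or False)
    context_lines -- current state of the lines set by the MPU
    """
    if context_lines[test_select] == polarity:
        return branch_target      # switch to the other program segment
    return pc + 1                 # carry on with the current segment


lines = [False] * 8
filter_active = next_address(0x10, 0x80, 2, True, lines)    # line clear
lines[2] = True                                             # set by the MPU
filter_bypassed = next_address(0x10, 0x80, 2, True, lines)  # line set
```

Because the test is part of the microinstruction that would be executed in any case, the decision costs no extra machine cycles in this model, which is the property claimed for the technique.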



3. Programming 

The programming of COPAS-2 has much in 
common with the previous version, COPAS-1, but 
is slightly more complex because of the added 
special-purpose hardware. Programs 
may be loaded into the writeable control store or 
stored more permanently in PROM. A mix of the 
two is permitted and this can be very useful for 
storing often-used sub-routines. In this way, 
the reassembly of large sections of program can be 
avoided. 

3.1. Instruction Format 

Each microinstruction consists of 56 bits 
organised as 18 fields. The use of overlays has 
been avoided to increase execution speed, but 
most fields are encoded so that the 
microinstruction wordlength does not become 
excessive. The operations that can be achieved 
within each field are shown in Fig. 5. 

(1) LOG/EXP field - these bits select the 
data to be placed on the C bus. Normally, the 
outputs of the coefficient memory are enabled, 
but using this field, data from the 2901B outputs, 
the logarithm look-up table or antilogarithm 
look-up table may be selected. 

(2) MDR - this bit controls a latch at the 
output of the 2901Bs. Data held in this latch 
(a Memory Data Register) may be written into the 
delay memory during a subsequent microcycle 
when the 2901B has calculated a memory address. 

(3) MULT field - controls the hardware 
multiplier. Although organised in a 3 bus system, 
the 16 LSBs of the product share the same bus 
as the Y input. Thus, LDXY and OZIX 
instructions can be supported, but the product 
cannot be obtained while loading the Y input. 

(4) DEST, ALU, SOURCE, CIN, 
D.P.RAM B, D.P.RAM A fields - for further 
information on these fields, the reader is referred 
to the AMD 2901 data book. One 
machine-dependent feature, however, is the use of 
the CIN field to determine the most significant bit 
after right shift operations. 

(5) SHIFT field - the hardware shifter can 
shift the multiplier product 0 to 7 bit positions 
to the right. Alternatively, the 3-bit shift code 
may be derived from a register with offset supplied 
from the microinstruction. 

(6) TEST SELect, POLarity and NEXT 
ADDRESS fields - these bits select the required 
status bit or context switch so that branch logic 
can determine the location of the next 
microinstruction. 

(7) I/O MEM and I/O ADDR fields - this 
encoded field supports a range of mixed input/ 
output and memory control functions. I/O ports 
are specified by the four bits of the address field 
and as long as no bus conflicts occur, other input/ 
output/memory functions can be designed. 

(8) LDI and IMM fields - control the loading 
and reading of the "immediate data register". 

(9) OPERAND field - a 9-bit field used to 
address coefficient memory, supply the next 
address or loop count. 

3.2. The Cross-assembler 

Within each field shown in Fig. 5, each valid 
bit combination is assigned a mnemonic to be 
used while writing programs. Programs generated 
using this set of mnemonics are translated into 
binary object code by a cross-assembler, in this 
case a commercially available software package 
called AMDASM which runs on a microprocessor 
development system. 

Using the assembler consists of two distinct 
phases - in the first the user defines the micro- 
instruction format, mnemonics and constant names; 
the second phase operates on a program which uses 
the mnemonics, constants, etc., specified in the 
first phase to produce the binary output, various 
listings and cross-reference tables. 

An additional program was written so that 
the binary output file could be transferred directly 
to the COPAS-2 writeable control store, and then 
immediately executed. For further details of the 
cross-assembler, the reader is referred to the 
Advanced Micro Devices literature on AMDASM. 

3.3. Diagnostics and Self-testing 

For a complex processor such as COPAS-2, 
the ability to detect and locate errors in hardware 
and software must be considered at an early stage 
of design. This is particularly true for a modular 
design which might ultimately be used in large 
numbers as in a digital mixing console. 

The first task is clearly to verify that the 
hardware is functioning correctly. A number of 
software routines were written to do this and 
additionally to indicate the nature of a fault 



when found. The only means of communication 
with COPAS-2 is via the IEEE488 bus and this is 
used as the starting point for the self-test. Firstly, 
communication with COPAS-2 is confirmed by 
forcing the MPU to return a status message on the 
GPIB. If this is satisfactory, the coefficient 
memory is tested by repetitively storing and 
checking a pseudo-random word sequence. A link 
with the signal processing circuits is then 
established, and these circuits may be tested by 
writing test results into the coefficient memory 
where they are checked for correctness by the 
local MPU. Again, the approach is to confirm 
correct operation of each hardware sub-system 
before using that sub-system in a subsequent 
test. Thus, the first test on a signal processing 
circuit sets the bit-slice outputs to zero and by 
connecting the Y bus with the C bus, zero is 
written into the coefficient memory. When all 
the data busses have been tested fully using pseudo- 
random word sequences, the logical and arithmetic 
functions of the bit-slice ALU and then the 
operation of the hardware multiplier and shifter 
are checked. The final test initialises a known 
digital filter with a step input and after a suitable 
interval checks the steady-state filter output. 
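
A store-and-check memory test of the kind described can be sketched as follows. The Report does not state how the pseudo-random words are generated, so the 16-bit linear feedback shift register used here (taps at bits 16, 14, 13 and 11, a commonly quoted maximal-length choice) is an assumption, as are the function names, the seed and the number of passes.

```python
def lfsr16(seed=0xACE1):
    """Yield an endless pseudo-random sequence of non-zero 16-bit words."""
    state = seed
    while True:
        yield state
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)


def memory_test(write, read, size=256, passes=4):
    """Fill the memory with pseudo-random words and read them back.

    write(address, word) and read(address) stand in for the accesses
    made to the coefficient memory. Returns True if every word read
    back matches the word written.
    """
    words = lfsr16()
    for _ in range(passes):
        pattern = [next(words) for _ in range(size)]
        for address, word in enumerate(pattern):
            write(address, word)
        for address, word in enumerate(pattern):
            if read(address) != word:
                return False
    return True


good = [0] * 256
good_result = memory_test(good.__setitem__, good.__getitem__)

faulty = [0] * 256                       # model with data bit 3 stuck at '0'
faulty_result = memory_test(faulty.__setitem__,
                            lambda address: faulty[address] & ~0x0008)
```

A pseudo-random pattern exercises every data bit in both states within a few words, so a single stuck bit such as the one modelled above is reported on the first pass.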

These test routines are stored in both the 
MPU PROM and the control store. When 
completed, the MPU transmits a message back to 
the system controller of the form: 

<8>0P1P2P3P4P5F6P$ 

In this example, device 8 reports that test 5 failed 
while all the others passed. The controller, with a 
knowledge of the self-test programming, can 
quickly determine the area of the fault. The 
testing takes about 20s to run because of the long 
pseudo-random sequences used but by checking 
the results locally as they are generated, a 
monitoring 'bottleneck' is avoided at the system 
controller. 
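
At the system controller, a report of this kind might be decoded as sketched below. The sketch assumes that the message consists of the device address in angle brackets, a series of single-digit test numbers each followed by P (pass) or F (fail), and a closing '$'. The function name is hypothetical and the controller software used in the experimental work was written in BASIC, not Python.

```python
import re


def parse_selftest(message):
    """Return (device, list of failed test numbers) for a self-test report."""
    match = re.fullmatch(r"<(\d+)>((?:\d[PF])+)\$", message)
    if match is None:
        raise ValueError("not a self-test report: " + repr(message))
    device = int(match.group(1))
    failed = [int(number)
              for number, result in re.findall(r"(\d)([PF])", match.group(2))
              if result == "F"]
    return device, failed


device, failed = parse_selftest("<8>0P1P2P3P4P5F6P$")
```

Here `device` is 8 and `failed` is `[5]`, matching the example in which device 8 reports that only test 5 failed.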

With the system hardware checked, 
attention must then be turned to verifying correct 
operation of the signal processing program. This is 
necessary for debugging programs and also as a 
safety net for programs which have got 'lost'. 
In the processor, all internal busses and the 
program counter were separately buffered and 
wired to pods so that a logic analyser could be 
simply connected. Thus, a real-time trace of 
program execution is instantly at hand. Built-in 
software performs a 'dump on request' of all 
COPAS internal registers into the coefficient 
memory and using GPIB commands, the contents 
of either coefficient memory or delay memory 
can be transferred to the system controller for 
analysis. 

When a program is executing correctly, it 
is possible that an incorrect message on the GPIB 
could, for example, result in a branch to a non- 
existent routine. All messages are, therefore, 
checked for syntax and data types and if an error 
is detected a 'service request' is placed on the GPIB. 
This interrupts the system controller and alerts 
the user to an immediate problem. 

4. A Modular System using COPAS-2 

Each COPAS-2 processor is a single board 
measuring 380mm x 250mm with a power 
consumption of approximately 35W. In Fig. 7 
the board is shown complete with additional 
logic to assist in logic analysis and program 
development.* 

In a versatile digital audio processing system, 
means must be provided for routing audio signals 
between the processing units and input/output 
devices. The method chosen to achieve this is 
illustrated in the three bus system of Fig. 6. 



Fig. 6 - A modular signal processing system 

The A and B busses are time-division-multiplexed 
and all data transfers are achieved using the third 
bus - the GPIB. This address is also used by those 
devices which send or receive data from the routing 
processor to identify the appropriate time slots 
on the multiplexed data busses. The input/output 
ports in Fig. 6 connect to ADC/DACs and do not 
require control by the GPIB. As a result of this 
scheme, the system is completely modular and 
additional COPAS-2 units can be easily added up 
to the multiplex limit on the A & B busses. In 
the experimental equipment 16 inputs were 
assigned to each device with a 128 time-slot 
multiplex, giving a maximum of eight devices 
in the system. Further expansion could be 
contemplated by repeating the system with more 
elaborate routing arrangements, although the 
complexity increases rapidly with the number of 
devices used. 

The system is supervised by the system 
controller; for the experimental work a Tektronix 
4052 graphics computer was used. Interactive 
programs were written in BASIC and present a 
'menu' to the user from which the desired 
functions can be selected. Appropriate numeric 
data is calculated or looked-up and assembled into 
GPIB commands using syntax rules which have 
been described in Ref.2. 

* This unit was constructed by D.J. Marshall. 

4.1. The Routing Processor 

The routing system to be described is based 
on techniques used in the digital telecommunications 
industry in which the route taken by 
audio data is determined by its position in a 
time-division multiplexed signal. A block diagram of 
the routing unit is given in Fig. 8 and its operation 
may be explained by reference to Fig. 9 simplified 
to a six channel unit. 

The incoming multiplex is easily generated 
by sequentially reading the outputs of each of the 
devices connected to the A bus. For a particular 
routing, an audio data sample must be stored, 
then forwarded to the outgoing multiplex in a 
different time slot. The store is termed the 
'crosspoint RAM' in Fig. 9 and the order of the 
outgoing multiplex is determined by the 'routing 
RAM'. In operation, data is sequentially written 
into the crosspoint RAM using a counter to 
provide write addresses (WA). The recorded data 
are read from the crosspoint RAM according 
to a pattern of read addresses (RA) stored in the 
routing RAM. One input may be routed to many 
outputs by repeating read addresses supplied to 
the crosspoint RAM, though the total number of 
routings is limited by the number of timeslots in 
the outgoing multiplex. 

Fig. 7 - The COPAS-2 signal processor. 

Fig. 8 - Routing processor block diagram. 

Fig. 9 - Illustration of the operation of audio routing. 
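
The action of the crosspoint RAM and routing RAM on one frame of the multiplex can be modelled as follows. This is a software illustration of the principle only, using a six channel frame. The function name and the sample values are hypothetical.

```python
def route_frame(incoming, routing_ram):
    """Reorder one multiplex frame of audio samples.

    incoming    -- samples in incoming time-slot order
    routing_ram -- for each outgoing time slot, the read address (RA)
                   of the sample to be sent in that slot
    """
    crosspoint_ram = [0] * len(incoming)
    # A counter supplies sequential write addresses (WA).
    for write_address, sample in enumerate(incoming):
        crosspoint_ram[write_address] = sample
    # Read addresses come from the routing RAM, one per outgoing slot.
    return [crosspoint_ram[read_address] for read_address in routing_ram]


incoming = [10, 11, 12, 13, 14, 15]
routing_ram = [0, 0, 3, 2, 5, 5]      # slots 0 and 5 are each sent twice
outgoing = route_frame(incoming, routing_ram)
```

The outgoing frame is `[10, 10, 13, 12, 15, 15]`. Repeating a read address sends one input to several outputs, and the inputs at slots 1 and 4 are dropped because no outgoing slot reads them.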

In the experimental equipment a 128 
channel multiplex and 32 kHz sampling rate results 
in data rates on the A and B busses of about 4 Mbit/s 
and this is easily transmitted over the short 
distances between each device. The routing unit 
is built on a single card measuring 225mm x 150mm 
and dissipates 4W. 

An important feature of the design is that 
the routing is determined by the contents of the 
routing RAM and this can be loaded at low data 
rates from the system controller using the GPIB. A 
microprocessor unit (MPU) is used to interpret 
high level messages transmitted from the system 
controller into appropriate address patterns in 
the routing RAM. These updates are made in the 
interval between successive clock periods in a 
synchronous fashion. For example, if device 4, 
output 7 is to be patched to device 6, input 3, as 
shown in Fig. 10 (1), the mnemonic (.P) is used 
and the resulting GPIB command sent to the 
routing processor is: 

.P-<4>O7 TO <6>I3$ 
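
As an illustration of how such a message might be built and interpreted, the sketch below formats and parses a patch command. The syntax is inferred from the single example above and is an assumption: the port fields are read as the letter O (output) or I (input) followed by a number, and `$` is taken to be the terminator. The names `format_patch`, `parse_patch` and `apply_patch` are hypothetical, and the dictionary stands in for the routing RAM without modelling its real address patterns.

```python
import re

# Assumed syntax: .P-<src device>O<output> TO <dst device>I<input>$
PATCH_RE = re.compile(r"^\.P-<(\d+)>O(\d+) TO <(\d+)>I(\d+)\$$")


def format_patch(src_dev, src_out, dst_dev, dst_in):
    """Build a patch command string in the assumed syntax."""
    return f".P-<{src_dev}>O{src_out} TO <{dst_dev}>I{dst_in}$"


def parse_patch(command):
    """Return (src_dev, src_out, dst_dev, dst_in) from a patch command."""
    match = PATCH_RE.match(command)
    if match is None:
        raise ValueError(f"not a patch command: {command!r}")
    return tuple(int(field) for field in match.groups())


def apply_patch(routing_table, command):
    """Record the routing: each destination input has exactly one source."""
    src_dev, src_out, dst_dev, dst_in = parse_patch(command)
    routing_table[(dst_dev, dst_in)] = (src_dev, src_out)


routing_table = {}
apply_patch(routing_table, format_patch(4, 7, 6, 3))
```

Keying the table by destination reflects the constraint that one source may feed many destinations while each destination takes a single source.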

In the experimental equipment, no special 
provision was made for 'glitch-free' switching of 
audio signals. However, the principle of such a 
matrix is demonstrated in Fig. 10, (1) - (5). This 
requires the fixed allocation of one matrix input 
and two matrix outputs to a hardware cross-fade 
unit (this function may be achieved within a 
COPAS module). The sequence of operations 
shown in Fig. 10 produces a smooth transition at 
the output when the source is switched between 
'old' and 'new' and requires a sequence of GPIB 
commands. 



(1) .P-<4>O7 TO <6>I3$ 

(2) .P-<4>O7 TO <9>I1$ 
    .P-<9>O1 TO <6>I3$ 

(3) .P-<3>O2 TO <9>I2$ 

(4) Device 9 completes the cross-fade 

(5) .P-<3>O2 TO <6>I3$ 

Fig. 10 - Sequence of operations for 'glitch-free' switching. 

Similar techniques can be used when one 
source is switched between destinations; the 
controlled fade-out at one output and fade-in at 
the other output can be administered by a similar 
sequence of operations. These sequences could be 
automatically generated by the system controller 
and the user need not know that they are occurring. 
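
A system controller could generate the five-step sequence along the following lines. This is a minimal sketch under stated assumptions: the command syntax is inferred from the examples in the text (letter O for an output, letter I for an input, `$` as terminator), the cross-fade unit is taken to be device 9 with inputs 1 and 2 and output 1 as in Fig. 10, and the function names are hypothetical. How the controller learns that the cross-fade has finished is not described in the text, so the sketch simply returns the commands to be sent before and after the fade.

```python
def patch(src_dev, src_out, dst_dev, dst_in):
    """Format one patch command in the assumed syntax."""
    return f".P-<{src_dev}>O{src_out} TO <{dst_dev}>I{dst_in}$"


def glitch_free_switch(old, new, dest, fader_dev=9):
    """Commands for switching `dest` from source `old` to source `new`.

    old, new -- (device, output) pairs for the two sources
    dest     -- (device, input) pair for the destination
    Returns (before_fade, after_fade): the commands to send before the
    cross-fade unit runs and the command to send once it has finished.
    """
    before_fade = [
        patch(*old, fader_dev, 1),   # step 2: old source into fader input 1
        patch(fader_dev, 1, *dest),  # step 2: fader output to the destination
        patch(*new, fader_dev, 2),   # step 3: new source into fader input 2
    ]
    # Step 4 (the cross-fade itself) happens inside the fader unit.
    after_fade = [patch(*new, *dest)]  # step 5: new source direct to dest
    return before_fade, after_fade


before, after = glitch_free_switch(old=(4, 7), new=(3, 2), dest=(6, 3))
```

After step 5 the cross-fade unit is out of the signal path and is free for the next switching operation.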



Additional commands are available to clear 
the routing RAM or to select a pre-programmed 
static routing stored in PROM in the routing 
processor. Since the design is modular, the 
physical location of each of the devices is not 
relevant to these routing operations being carried 
out. 



4.2. Input/Output Units 

Communication with ADCs and DACs is 
handled by dedicated input/output units. These 
units recognise a device address from the routing 
processor and allocate an input or output port 
to each of the 16 time-slots corresponding with 
that device. Parallel data is thereby transferred 
from an ADC or DAC to the routing processor 
via the multiplexed data busses. 

In addition, provision is made for a serial 
digital interface using two time-slots of the multi- 
plexed data busses. Data is encoded serially using 
a format which is receiving much support in the 
audio industry'. This is suitable for transmission 
of, for example, a stereo signal over distances of at 
least 40m with a wired link. 

Each of these units recognises a fixed device 
address and does not require interaction with the 
system controller. 

4.3. Delay Unit 

The delay unit is one example of a wide 
range of special purpose processors that can be 
installed in this modular signal processing system; 
others could be look-up tables, signal generators, 
fast Fourier transformers, etc. A block diagram 
of the unit is shown in Fig. 11. It consists of a 
64K word memory based on dynamic 64K RAMs. 
A microprogrammed address generator provides a 
choice of delay programs with parameters 
transferred using the GPIB. Up to eight different 
microprograms can be selected and each handles 
the refresh of the dynamic RAMs automatically. 
Up to eight separate delay loops can be set up 
using the eight input/output ports and the circuitry 
is built on a single card measuring 225mm x 
150mm, and dissipates 7W. 

Two microprograms have been written for 
the unit. Each microinstruction has 24 bits and a 
special assembler was created for this format 
using AMD ASM. The first program sets up 
eight delay loops with start and finish 
addresses determined by the contents of the 
'delay pattern RAM'. Thus data presented at 
input port 3 will appear at output port 3 
after a specified interval. The second program 
executes a 'store and repeat' for the entire 
contents of the store. In this way, a 'trace' 
of digital data can be made lasting up to two 
seconds at a 32 kHz sampling rate. Start 
and finish addresses loaded into the delay 
pattern RAM determine those parts of the 
stored trace to be output. 
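
The behaviour of a single delay loop can be modelled as a circular region of the store bounded by its start and finish addresses. The sketch below is illustrative only: the class name `DelayLoop` and the read-then-overwrite addressing scheme are assumptions, since the text does not describe how the microprogrammed address generator sequences its accesses. It also checks the two-second figure, taking 64K to mean 65536 words.

```python
MEMORY_WORDS = 64 * 1024      # 64K word store
SAMPLE_RATE_HZ = 32_000       # 32 kHz sampling rate

# Longest trace or delay the store can hold, in seconds.
max_delay_s = MEMORY_WORDS / SAMPLE_RATE_HZ


class DelayLoop:
    """One delay loop occupying addresses start..finish of the store."""

    def __init__(self, memory, start, finish):
        self.memory = memory
        self.start = start
        self.finish = finish
        self.addr = start

    def tick(self, sample):
        """Process one sample period: read the oldest word, store the new one."""
        delayed = self.memory[self.addr]
        self.memory[self.addr] = sample
        # Wrap back to the start address after the finish address.
        self.addr = self.start if self.addr == self.finish else self.addr + 1
        return delayed


memory = [0] * MEMORY_WORDS
loop = DelayLoop(memory, start=100, finish=102)   # three-sample delay
outputs = [loop.tick(x) for x in [1, 2, 3, 4, 5, 6]]
```

In this model the delay in samples equals `finish - start + 1`, so eight loops can share the store provided their address ranges do not overlap.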



The unit can be 'patched' using the routing 
processor to provide long delays for COPAS 
programs while avoiding the processing overhead. 
It has been found to be invaluable for experiments 
in reverberation and other special effects. 

5. Conclusions 

An improved, high-performance digital audio 
signal processor, COPAS-2, has been described. 
It is constructed and programmed to carry out 
real-time equalisation and dynamic range control 
to the highest quality standards. The modular 
design has made the task of interfacing with other 
similar units much easier and makes it highly suited 
to use in a digital mixing desk where many 
channels of audio processing are required. The 
design can be extended to increase its capabilities 
without significantly affecting the programming 
software that has so far been written. The 
important features, i.e. the internal architecture, 
hardware/software compromises, context switching 
and control via a standard bus, have been found 
experimentally to be correctly balanced. COPAS-2 
is being applied in the design of a digital sound 
mixing desk being manufactured by the Neve 
group of companies. 